Oct 24, 2022
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import geopandas as gpd
Two case studies:
Source: https://earthobservatory.nasa.gov/world-of-change/AralSea
Source: https://earthobservatory.nasa.gov/images/76327/lake-orumiyeh-iran
Source: https://earthobservatory.nasa.gov/images/91921/disappearing-walker-lake
import intake
import xarray as xr
import cartopy.crs as ccrs
# hvplot
import hvplot.xarray
import hvplot.pandas
import holoviews as hv
import geoviews as gv
from geoviews.tile_sources import EsriImagery
/Users/nhand/mambaforge/envs/musa-550-fall-2022/lib/python3.9/site-packages/dask/dataframe/backends.py:189: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead. _numeric_index_types = (pd.Int64Index, pd.Float64Index, pd.UInt64Index) /Users/nhand/mambaforge/envs/musa-550-fall-2022/lib/python3.9/site-packages/dask/dataframe/backends.py:189: FutureWarning: pandas.Float64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead. _numeric_index_types = (pd.Int64Index, pd.Float64Index, pd.UInt64Index) /Users/nhand/mambaforge/envs/musa-550-fall-2022/lib/python3.9/site-packages/dask/dataframe/backends.py:189: FutureWarning: pandas.UInt64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead. _numeric_index_types = (pd.Int64Index, pd.Float64Index, pd.UInt64Index)
url = 'https://raw.githubusercontent.com/pyviz-topics/EarthML/master/examples/catalog.yml'
cat = intake.open_catalog(url)
list(cat)
['landsat_5_small', 'landsat_8_small', 'landsat_5', 'landsat_8', 'google_landsat_band', 'amazon_landsat_band', 'fluxnet_daily', 'fluxnet_metadata', 'seattle_lidar']
We'll focus on the Landsat 5 and 8 small datasets.
These are "small" snapshots around Walker Lake, around cut out of the larger Landsat dataset.
landsat_5 = cat.landsat_5_small()
landsat_5
landsat_5_small:
args:
chunks:
band: 1
x: 50
y: 50
concat_dim: band
storage_options:
anon: true
urlpath: s3://earth-data/landsat/small/LT05_L1TP_042033_19881022_20161001_01_T1_sr_band{band:d}.tif
description: Small version of Landsat 5 Surface Reflectance Level-2 Science Product.
driver: intake_xarray.raster.RasterIOSource
metadata:
cache:
- argkey: urlpath
regex: earth-data/landsat
type: file
catalog_dir: https://raw.githubusercontent.com/pyviz-topics/EarthML/master/examples
plots:
band_image:
dynamic: false
groupby: band
kind: image
rasterize: true
width: 400
x: x
y: y
Leveraging the power of hvplot under the hood...
landsat_5.hvplot.band_image()
landsat_5.hvplot.image(
x="x",
y="y",
groupby="band",
rasterize=True,
frame_width=400,
frame_height=400,
geo=True,
crs=32611,
)
landsat_8 = cat.landsat_8_small()
landsat_8.hvplot.image(
x="x",
y="y",
groupby="band",
rasterize=True,
frame_width=400,
frame_height=400,
geo=True,
crs=32611,
)
This will return an xarray DataArray where the values are stored as dask arrays.
landsat_5_da = landsat_5.to_dask()
landsat_8_da = landsat_8.to_dask()
landsat_5_da
<xarray.DataArray (band: 6, y: 300, x: 300)>
array([[[ 640., 842., 864., ..., 1250., 929., 1111.],
[ 796., 774., 707., ..., 1136., 906., 1065.],
[ 975., 707., 908., ..., 1386., 1249., 1088.],
...,
[1243., 1202., 1160., ..., 1132., 1067., 845.],
[1287., 1334., 1292., ..., 801., 934., 845.],
[1550., 1356., 1314., ..., 1309., 1636., 1199.]],
[[ 810., 1096., 1191., ..., 1749., 1266., 1411.],
[1096., 1048., 905., ..., 1556., 1217., 1411.],
[1286., 1000., 1286., ..., 1749., 1604., 1411.],
...,
[1752., 1565., 1566., ..., 1502., 1456., 1078.],
[1752., 1799., 1706., ..., 983., 1172., 1077.],
[1893., 1753., 1754., ..., 1736., 2250., 1736.]],
[[1007., 1345., 1471., ..., 2102., 1462., 1719.],
[1260., 1259., 1175., ..., 1847., 1419., 1719.],
[1555., 1175., 1555., ..., 2059., 1889., 1760.],
...,
...
...,
[2429., 2138., 2041., ..., 2175., 1885., 1301.],
[2381., 2333., 2382., ..., 1204., 1495., 1301.],
[2478., 2041., 2333., ..., 2755., 3431., 2223.]],
[[1819., 2596., 2495., ..., 2429., 1785., 2023.],
[2259., 2359., 1885., ..., 2158., 1684., 1921.],
[2865., 2291., 2664., ..., 2057., 1955., 2057.],
...,
[3081., 2679., 2612., ..., 2499., 2098., 1395.],
[2779., 2544., 2779., ..., 1429., 1596., 1496.],
[3183., 2309., 2679., ..., 3067., 3802., 2665.]],
[[1682., 2215., 2070., ..., 2072., 1440., 1780.],
[1876., 1973., 1633., ..., 1926., 1586., 1635.],
[2409., 1924., 2215., ..., 1829., 1780., 1829.],
...,
[2585., 2296., 2296., ..., 2093., 1710., 1134.],
[2393., 2344., 2489., ..., 1182., 1374., 1326.],
[2826., 2007., 2393., ..., 2860., 3724., 2333.]]])
Coordinates:
* band (band) int64 1 2 3 4 5 7
* y (y) float64 4.309e+06 4.309e+06 4.309e+06 ... 4.264e+06 4.264e+06
* x (x) float64 3.324e+05 3.326e+05 3.327e+05 ... 3.771e+05 3.772e+05
Attributes:
transform: (150.0, 0.0, 332325.0, 0.0, -150.0, 4309275.0)
crs: +init=epsg:32611
res: (150.0, 150.0)
is_tiled: 0
nodatavals: (nan,)
scales: (1.0,)
offsets: (0.0,)
AREA_OR_POINT: Arealandsat_5_da.values
array([[[ 640., 842., 864., ..., 1250., 929., 1111.],
[ 796., 774., 707., ..., 1136., 906., 1065.],
[ 975., 707., 908., ..., 1386., 1249., 1088.],
...,
[1243., 1202., 1160., ..., 1132., 1067., 845.],
[1287., 1334., 1292., ..., 801., 934., 845.],
[1550., 1356., 1314., ..., 1309., 1636., 1199.]],
[[ 810., 1096., 1191., ..., 1749., 1266., 1411.],
[1096., 1048., 905., ..., 1556., 1217., 1411.],
[1286., 1000., 1286., ..., 1749., 1604., 1411.],
...,
[1752., 1565., 1566., ..., 1502., 1456., 1078.],
[1752., 1799., 1706., ..., 983., 1172., 1077.],
[1893., 1753., 1754., ..., 1736., 2250., 1736.]],
[[1007., 1345., 1471., ..., 2102., 1462., 1719.],
[1260., 1259., 1175., ..., 1847., 1419., 1719.],
[1555., 1175., 1555., ..., 2059., 1889., 1760.],
...,
[2090., 1840., 1798., ..., 1828., 1703., 1242.],
[2048., 2049., 2008., ..., 1074., 1326., 1158.],
[2216., 1965., 2049., ..., 2202., 2783., 1994.]],
[[1221., 1662., 1809., ..., 2401., 1660., 1957.],
[1564., 1465., 1367., ..., 2105., 1610., 1907.],
[2004., 1465., 1955., ..., 2302., 2055., 2006.],
...,
[2429., 2138., 2041., ..., 2175., 1885., 1301.],
[2381., 2333., 2382., ..., 1204., 1495., 1301.],
[2478., 2041., 2333., ..., 2755., 3431., 2223.]],
[[1819., 2596., 2495., ..., 2429., 1785., 2023.],
[2259., 2359., 1885., ..., 2158., 1684., 1921.],
[2865., 2291., 2664., ..., 2057., 1955., 2057.],
...,
[3081., 2679., 2612., ..., 2499., 2098., 1395.],
[2779., 2544., 2779., ..., 1429., 1596., 1496.],
[3183., 2309., 2679., ..., 3067., 3802., 2665.]],
[[1682., 2215., 2070., ..., 2072., 1440., 1780.],
[1876., 1973., 1633., ..., 1926., 1586., 1635.],
[2409., 1924., 2215., ..., 1829., 1780., 1829.],
...,
[2585., 2296., 2296., ..., 2093., 1710., 1134.],
[2393., 2344., 2489., ..., 1182., 1374., 1326.],
[2826., 2007., 2393., ..., 2860., 3724., 2333.]]])
Use the attrs attribute
landsat_5_da.attrs
{'transform': (150.0, 0.0, 332325.0, 0.0, -150.0, 4309275.0),
'crs': '+init=epsg:32611',
'res': (150.0, 150.0),
'is_tiled': 0,
'nodatavals': (nan,),
'scales': (1.0,),
'offsets': (0.0,),
'AREA_OR_POINT': 'Area'}
landsat_5_da.crs
'+init=epsg:32611'
Problem: they appear to cover different areas
See week 4's lecture for a reminder!
Hints
earthpy package to do the plotting.data.sel(band=bands) syntax to select specific bands from the dataset, where bands is a list of the desired bands to be selected.import earthpy.plot as ep
# Get the RGB data from landsat 8 dataset as a numpy array
rgb_data_8 = landsat_8_da.sel(band=[4,3,2]).values
# # Initialize
fig, ax = plt.subplots(figsize=(10,10))
# # Plot the RGB bands
ep.plot_rgb(rgb_data_8, rgb=(0, 1, 2), ax=ax);
# Get the RGB data from landsat 5 dataset as a numpy array
rgb_data_5 = landsat_5_da.sel(band=[3,2,1]).values
# # Initialize
fig, ax = plt.subplots(figsize=(10,10))
# # Plot the RGB bands
ep.plot_rgb(rgb_data_5, rgb=(0, 1, 2), ax=ax);
Yes, we can use xarray builtin selection functionality!
The Landsat 5 images is more zoomed in, so let's trim to the Landsat 8 data to this range
Take advantage of the "coordinate" arrays provided by xarray:
landsat_5_da.x
<xarray.DataArray 'x' (x: 300)> array([332400., 332550., 332700., ..., 376950., 377100., 377250.]) Coordinates: * x (x) float64 3.324e+05 3.326e+05 3.327e+05 ... 3.771e+05 3.772e+05
landsat_5_da.y
<xarray.DataArray 'y' (y: 300)> array([4309200., 4309050., 4308900., ..., 4264650., 4264500., 4264350.]) Coordinates: * y (y) float64 4.309e+06 4.309e+06 4.309e+06 ... 4.264e+06 4.264e+06
Remember: These coordinates are in units of meters!
# x bounds
xmin = landsat_5_da.x.min()
xmax = landsat_5_da.x.max()
# y bounds
ymin = landsat_5_da.y.min()
ymax = landsat_5_da.y.max()
We can use Python's built-in slice() function to slice the data's coordinate arrays and select the subset of the data we want.
More info: http://xarray.pydata.org/en/stable/indexing.html
Syntax: slice(start, end, step)
More info: https://www.w3schools.com/python/ref_func_slice.asp
letters = ["a", "b", "c", "d", "e", "f", "g", "h"]
letters[0:3]
['a', 'b', 'c']
letters[slice(0, 3)]
['a', 'b', 'c']
We can use the .sel() function to slice our x and y coordinate arrays!
# slice the x and y coordinates
slice_x = slice(xmin, xmax)
slice_y = slice(ymax, ymin) # IMPORTANT: y coordinate array is in descending order
# Use the .sel() to slice
landsat_8_trimmed = landsat_8_da.sel(x=slice_x, y=slice_y)
y coordinate is stored in descending order, so the slice should be ordered the same way (from ymax to ymin)¶landsat_8_da.y
<xarray.DataArray 'y' (y: 286)> array([4320900., 4320690., 4320480., ..., 4261470., 4261260., 4261050.]) Coordinates: * y (y) float64 4.321e+06 4.321e+06 4.32e+06 ... 4.261e+06 4.261e+06
landsat_8_trimmed.shape
(7, 214, 213)
landsat_8_da.shape
(7, 286, 286)
landsat_8_trimmed
<xarray.DataArray (band: 7, y: 214, x: 213)>
array([[[ 730., 721., 703., ..., 970., 1059., 520.],
[ 700., 751., 656., ..., 835., 1248., 839.],
[ 721., 754., 796., ..., 1065., 1207., 1080.],
...,
[1022., 949., 983., ..., 516., 598., 676.],
[ 976., 757., 769., ..., 950., 954., 634.],
[ 954., 1034., 788., ..., 1133., 662., 1055.]],
[[ 874., 879., 858., ..., 1157., 1259., 609.],
[ 860., 919., 814., ..., 968., 1523., 983.],
[ 896., 939., 982., ..., 1203., 1412., 1292.],
...,
[1243., 1157., 1212., ..., 604., 680., 805.],
[1215., 953., 949., ..., 1095., 1110., 755.],
[1200., 1258., 978., ..., 1340., 758., 1189.]],
[[1181., 1148., 1104., ..., 1459., 1633., 775.],
[1154., 1223., 1093., ..., 1220., 1851., 1345.],
[1198., 1258., 1309., ..., 1531., 1851., 1674.],
...,
...
...,
[2652., 2459., 2649., ..., 1190., 1299., 1581.],
[2547., 1892., 2212., ..., 2284., 2416., 1475.],
[2400., 2627., 2405., ..., 2469., 1579., 2367.]],
[[3039., 2806., 2877., ..., 1965., 2367., 1203.],
[2779., 2799., 2474., ..., 1685., 2339., 1637.],
[2597., 2822., 2790., ..., 2030., 2587., 2116.],
...,
[3144., 2892., 3168., ..., 1453., 1594., 1765.],
[3109., 2405., 2731., ..., 2804., 3061., 1653.],
[2518., 3110., 3144., ..., 2801., 2051., 2770.]],
[[2528., 2326., 2417., ..., 1748., 2139., 1023.],
[2260., 2238., 1919., ..., 1519., 2096., 1448.],
[2041., 2226., 2247., ..., 1848., 2380., 1973.],
...,
[2661., 2423., 2556., ..., 1225., 1335., 1469.],
[2573., 1963., 2091., ..., 2479., 2570., 1393.],
[2191., 2525., 2504., ..., 2658., 1779., 2514.]]])
Coordinates:
* band (band) int64 1 2 3 4 5 6 7
* y (y) float64 4.309e+06 4.309e+06 4.309e+06 ... 4.265e+06 4.264e+06
* x (x) float64 3.326e+05 3.328e+05 3.33e+05 ... 3.769e+05 3.771e+05
Attributes:
transform: (210.0, 0.0, 318195.0, 0.0, -210.0, 4321005.0)
crs: +init=epsg:32611
res: (210.0, 210.0)
is_tiled: 0
nodatavals: (nan,)
scales: (1.0,)
offsets: (0.0,)
AREA_OR_POINT: Area# Get the trimmed landsat 8 RGB data as a numpy array
rgb_data_8 = landsat_8_trimmed.sel(band=[4,3,2]).values
# # Initialize
fig, ax = plt.subplots(figsize=(10,10))
# # Plot the RGB bands
ep.plot_rgb(rgb_data_8, rgb=(0, 1, 2), ax=ax);
# Select the RGB data as a numpy array
rgb_data_5 = landsat_5_da.sel(band=[3,2,1]).values
# Initialize the figure
fig, ax = plt.subplots(figsize=(10,10))
# Plot the RGB bands
ep.plot_rgb(rgb_data_5, rgb=(0, 1, 2), ax=ax);
Yes! We'll use the change in the NDVI over time to detect change in lake volume
For Landsat 5: NIR = band 4 and Red = band 3
For Landsat 8: NIR = band 5 and Red = band 4
NIR_1988 = landsat_5_da.sel(band=4)
RED_1988 = landsat_5_da.sel(band=3)
NDVI_1988 = (NIR_1988 - RED_1988) / (NIR_1988 + RED_1988)
NIR_2017 = landsat_8_da.sel(band=5)
RED_2017 = landsat_8_da.sel(band=4)
NDVI_2017 = (NIR_2017 - RED_2017) / (NIR_2017 + RED_2017)
hvplot.image() function to show the differencecoolwarm, is particularly useful for this situationdiff = NDVI_2017 - NDVI_1988
diff.hvplot.image(
x="x",
y="y",
frame_height=400,
frame_width=400,
cmap="coolwarm",
clim=(-1, 1),
geo=True,
crs=32611,
)
NDVI_1988.shape
(300, 300)
NDVI_2017.shape
(286, 286)
diff.shape
(42, 43)
# Different x/y ranges!
print(NDVI_1988.x[0].values)
print(NDVI_2017.x[0].values)
332400.0 318300.0
# Different resolutions
print((NDVI_1988.x[1] - NDVI_1988.x[0]).values)
print((NDVI_2017.x[1] - NDVI_2017.x[0]).values)
150.0 210.0
# The center lat/lng values in EPSG = 4326
# I got these points from google maps
x0 = -118.7081
y0 = 38.6942
Let's convert these coordinates to the sames CRS as the Landsat data
cartopy to the rescue!
import cartopy.crs as ccrs
# Initialize a CRS object for our EPSG=32611 CRS
crs = ccrs.epsg("32611")
# Convert from (x0,y0) in 4326 to (x_center, y_center) in 32611
x_center, y_center = crs.transform_point(x0, y0, ccrs.PlateCarree())
Remember: we are converting from EPSG=4326 (PlateCarree) to EPSG=32611
Let's add a 30 km bounding box around the midpoint
buffer = 15e3 # in km
xmin = x_center - buffer
xmax = x_center + buffer
ymin = y_center - buffer
ymax = y_center + buffer
bounding_box = [[[xmin, ymin], [xmin, ymax], [xmax, ymax], [xmax, ymin]]]
Use the builting geoviews imagery to confirm..
EsriImagery * gv.Polygons(bounding_box, crs=crs).options(alpha=0.4)
res = 200 # 200 meter resolution
x = np.arange(xmin, xmax, res)
y = np.arange(ymin, ymax, res)
This does a linear interpolation of the data using the nearest pixels.
NDVI_2017_regridded = NDVI_2017.interp(x=x, y=y)
NDVI_1988_regridded = NDVI_1988.interp(x=x, y=y)
img1988 = NDVI_1988_regridded.hvplot.image(
x="x",
y="y",
crs=32611,
geo=True,
frame_height=300,
frame_width=300,
clim=(-3, 1),
cmap="fire",
title="1988"
)
img2017 = NDVI_2017_regridded.hvplot.image(
x="x",
y="y",
crs=32611,
geo=True,
frame_height=300,
frame_width=300,
clim=(-3, 1),
cmap="fire",
title="2017"
)
img1988 + img2017
Now that the images are lined up, the change in lake volume is clearly apparent
diff_regridded = NDVI_2017_regridded - NDVI_1988_regridded
diff_regridded
<xarray.DataArray (y: 150, x: 150)>
array([[ 0.07285845, 0.02394533, 0.02255797, ..., 0.022373 ,
0.02654387, 0.01870564],
[ 0.09309169, 0.11359612, 0.16240596, ..., 0.0184576 ,
0.01718405, 0.01623927],
[ 0.1039483 , 0.15174163, 0.14382052, ..., 0.01866696,
0.01353435, 0.0147996 ],
...,
[-0.10564601, 0.01861015, 0.0668057 , ..., 0.01163675,
0.01841919, 0.00968276],
[ 0.01748623, -0.06021414, 0.03493955, ..., 0.03757851,
0.02141323, 0.03405552],
[ 0.0131126 , 0.08405171, -0.02384596, ..., 0.01839516,
0.00968287, 0.02835872]])
Coordinates:
* x (x) float64 3.365e+05 3.367e+05 3.369e+05 ... 3.661e+05 3.663e+05
* y (y) float64 4.269e+06 4.269e+06 4.27e+06 ... 4.299e+06 4.299e+06diff_regridded.hvplot.image(
x="x",
y="y",
crs=32611,
geo=True,
frame_height=400,
frame_width=400,
cmap="coolwarm",
clim=(-1, 1),
)
Given hi-resolution data, we can downsample to a lower resolution with the familiar groupby / mean framework from pandas
Let's try a resolution of 1000 meters instead of 200 meters
# Define a low-resolution grid
res_1000 = 1000
x_1000 = np.arange(xmin, xmax, res_1000)
y_1000 = np.arange(ymin, ymax, res_1000)
# Groupby new bins and take the mean of all pixels falling into a group
# First: groupby low-resolution x bins and take mean
# Then: groupby low-resolution y bins and take mean
diff_res_1000 = (
diff_regridded.groupby_bins("x", x_1000, labels=x_1000[:-1])
.mean(dim="x")
.groupby_bins("y", y_1000, labels=y_1000[:-1])
.mean(dim="y")
.rename(x_bins="x", y_bins="y")
)
diff_res_1000
<xarray.DataArray (y: 29, x: 29)>
array([[ 0.06869101, 0.13177651, 0.15469471, 0.07722272, 0.08874126,
0.07121849, 0.03444372, 0.037119 , 0.00104902, 0.04373775,
0.06956536, 0.11455193, 0.1385741 , 0.11434505, 0.06491517,
0.06781426, 0.05570124, 0.05596228, 0.02769747, 0.02554617,
0.03236267, 0.01968282, 0.02387923, 0.01797904, 0.01902946,
0.02166195, 0.02174784, 0.01880276, 0.01359409],
[ 0.08346992, 0.08776704, 0.13548499, 0.08220051, 0.09708871,
0.06775058, 0.03779001, 0.01991329, 0.07547025, 0.13493956,
0.20133291, 0.12017601, 0.07325837, 0.05940746, 0.07004369,
0.0447765 , 0.04581451, 0.03409407, 0.02344497, 0.03093005,
0.02869954, 0.03271979, 0.02381046, 0.01556579, 0.02169905,
0.0227847 , 0.02022507, 0.018137 , 0.01564086],
[ 0.0936053 , 0.09289295, 0.12060025, 0.12873584, 0.16235808,
0.10101415, 0.04018545, 0.01535085, 0.07127819, 0.16983906,
0.10893062, 0.21828906, 0.08079256, 0.07577124, 0.06833529,
0.01555701, -0.01109825, -0.00299437, 0.01431601, 0.02801818,
0.0293123 , 0.02606464, 0.02491255, 0.01810481, 0.01536261,
0.02188127, 0.02200607, 0.01673377, 0.00776444],
[ 0.08717537, 0.07700656, 0.13392983, 0.14826298, 0.15392851,
0.09926764, 0.0569021 , 0.0610834 , 0.08476449, 0.08527878,
...
0.02798621, 0.02992438, 0.03641541, 0.02005467, 0.02204858,
0.0287752 , 0.02188122, 0.02495601, 0.02506901],
[ 0.05668256, 0.07664267, 0.08830451, 0.07859942, 0.12379519,
0.114724 , 0.0405303 , 0.03885428, 0.06682004, 0.05558355,
0.03833086, 0.02902788, 0.17875198, 0.20029484, 0.01872939,
0.02381692, 0.02002493, 0.0218631 , 0.02401623, 0.02320183,
0.03253913, 0.01651287, 0.02256299, 0.03535588, 0.03159138,
0.02213577, 0.02914387, 0.029847 , 0.02951869],
[ 0.0616654 , 0.04362798, 0.06757889, 0.13201092, 0.10358936,
0.08289852, 0.10031057, 0.07410737, 0.06310736, 0.0420152 ,
0.04179894, 0.02750726, -0.05992471, 0.02938139, 0.03474508,
0.02488141, 0.02266253, 0.02158754, 0.0207588 , 0.02306504,
0.02320071, 0.02288257, 0.02793184, 0.02504522, 0.02886683,
0.022406 , 0.01791923, 0.0284036 , 0.02830287],
[ 0.04925822, 0.07870229, 0.1287353 , 0.13487135, 0.07996429,
0.09300206, 0.07809081, 0.07023948, 0.04476815, 0.03548731,
0.03374816, -0.04190856, -0.00502124, 0.03805152, 0.03067927,
0.02737551, 0.02609662, 0.02289044, 0.02446029, 0.02962656,
0.0284653 , 0.02105816, 0.03050465, 0.02811649, 0.02367856,
0.03380146, 0.02753168, 0.02918184, 0.02360341]])
Coordinates:
* x (x) float64 3.365e+05 3.375e+05 3.385e+05 ... 3.635e+05 3.645e+05
* y (y) float64 4.269e+06 4.27e+06 4.271e+06 ... 4.296e+06 4.297e+06diff_res_1000.hvplot.image(
x="x",
y="y",
crs=32611,
geo=True,
frame_width=500,
frame_height=400,
cmap="coolwarm",
clim=(-1, 1),
)
band_info = pd.DataFrame([
(1, "Aerosol", " 0.43 - 0.45", 0.440, "30", "Coastal aerosol"),
(2, "Blue", " 0.45 - 0.51", 0.480, "30", "Blue"),
(3, "Green", " 0.53 - 0.59", 0.560, "30", "Green"),
(4, "Red", " 0.64 - 0.67", 0.655, "30", "Red"),
(5, "NIR", " 0.85 - 0.88", 0.865, "30", "Near Infrared (NIR)"),
(6, "SWIR1", " 1.57 - 1.65", 1.610, "30", "Shortwave Infrared (SWIR) 1"),
(7, "SWIR2", " 2.11 - 2.29", 2.200, "30", "Shortwave Infrared (SWIR) 2"),
(8, "Panc", " 0.50 - 0.68", 0.590, "15", "Panchromatic"),
(9, "Cirrus", " 1.36 - 1.38", 1.370, "30", "Cirrus"),
(10, "TIRS1", "10.60 - 11.19", 10.895, "100 * (30)", "Thermal Infrared (TIRS) 1"),
(11, "TIRS2", "11.50 - 12.51", 12.005, "100 * (30)", "Thermal Infrared (TIRS) 2")],
columns=['Band', 'Name', 'Wavelength Range (µm)', 'Nominal Wavelength (µm)', 'Resolution (m)', 'Description']).set_index(["Band"])
band_info
| Name | Wavelength Range (µm) | Nominal Wavelength (µm) | Resolution (m) | Description | |
|---|---|---|---|---|---|
| Band | |||||
| 1 | Aerosol | 0.43 - 0.45 | 0.440 | 30 | Coastal aerosol |
| 2 | Blue | 0.45 - 0.51 | 0.480 | 30 | Blue |
| 3 | Green | 0.53 - 0.59 | 0.560 | 30 | Green |
| 4 | Red | 0.64 - 0.67 | 0.655 | 30 | Red |
| 5 | NIR | 0.85 - 0.88 | 0.865 | 30 | Near Infrared (NIR) |
| 6 | SWIR1 | 1.57 - 1.65 | 1.610 | 30 | Shortwave Infrared (SWIR) 1 |
| 7 | SWIR2 | 2.11 - 2.29 | 2.200 | 30 | Shortwave Infrared (SWIR) 2 |
| 8 | Panc | 0.50 - 0.68 | 0.590 | 15 | Panchromatic |
| 9 | Cirrus | 1.36 - 1.38 | 1.370 | 30 | Cirrus |
| 10 | TIRS1 | 10.60 - 11.19 | 10.895 | 100 * (30) | Thermal Infrared (TIRS) 1 |
| 11 | TIRS2 | 11.50 - 12.51 | 12.005 | 100 * (30) | Thermal Infrared (TIRS) 2 |
We'll be downloading publicly available Landsat data from Google Cloud Storage
See: https://cloud.google.com/storage/docs/public-datasets/landsat
The relevant information is stored in our intake catalog:
From our catalog.yml file:
google_landsat_band:
description: Landsat bands from Google Cloud Storage
driver: rasterio
parameters:
path:
description: landsat path
type: int
row:
description: landsat row
type: int
product_id:
description: landsat file id
type: str
band:
description: band
type: int
args:
urlpath: https://storage.googleapis.com/gcp-public-data-landsat/LC08/01/{{ '%03d' % path }}/{{ '%03d' % row }}/{{ product_id }}/{{ product_id }}_B{{ band }}.TIF
chunks:
band: 1
x: 256
y: 256
From the "urlpath" above, you can see we need "path", "row", and "product_id" variables to properly identify a Landsat image:
The path and row corresponding to the area over Philadelphia has already been selected using the Earth Explorer. This tool was also used to find the id of the particular date of interest using the same tool.
# Necessary variables
path = 14
row = 32
product_id = 'LC08_L1TP_014032_20160727_20170222_01_T1'
We won't cover the specifics in the course, but geemap is a fantastic library for interacting with GEE.
geemap package:
This will return a specific Landsat band as a dask array.
from random import random
from time import sleep
def get_band_with_exponential_backoff(path, row, product_id, band, maximum_backoff=32):
"""
Google Cloud Storage recommends using exponential backoff
when accessing the API.
https://cloud.google.com/storage/docs/exponential-backoff
"""
n = backoff = 0
while backoff < maximum_backoff:
try:
return cat.google_landsat_band(
path=path, row=row, product_id=product_id, band=band
).to_dask()
except:
backoff = min(2 ** n + random(), maximum_backoff)
sleep(backoff)
n += 1
Loop over each band, load that band using the above function, and then concatenate the results together..
bands = [1, 2, 3, 4, 5, 6, 7, 9, 10, 11]
datasets = []
for band in bands:
da = get_band_with_exponential_backoff(
path=path, row=row, product_id=product_id, band=band
)
da = da.assign_coords(band=[band])
datasets.append(da)
ds = xr.concat(datasets, "band", compat="identical")
ds
<xarray.DataArray (band: 10, y: 7871, x: 7741)>
dask.array<concatenate, shape=(10, 7871, 7741), dtype=uint16, chunksize=(1, 256, 256), chunktype=numpy.ndarray>
Coordinates:
* band (band) int64 1 2 3 4 5 6 7 9 10 11
* y (y) float64 4.582e+06 4.582e+06 4.582e+06 ... 4.346e+06 4.346e+06
* x (x) float64 3.954e+05 3.954e+05 3.955e+05 ... 6.276e+05 6.276e+05
Attributes:
transform: (30.0, 0.0, 395385.0, 0.0, -30.0, 4582215.0)
crs: +init=epsg:32618
res: (30.0, 30.0)
is_tiled: 1
nodatavals: (nan,)
scales: (1.0,)
offsets: (0.0,)
AREA_OR_POINT: PointCRS is EPSG=32618
def load_google_landsat_metadata(path, row, product_id):
"""
Load Landsat metadata for path, row, product_id from Google Cloud Storage
"""
def parse_type(x):
try:
return eval(x)
except:
return x
baseurl = "https://storage.googleapis.com/gcp-public-data-landsat/LC08/01"
suffix = f"{path:03d}/{row:03d}/{product_id}/{product_id}_MTL.txt"
df = intake.open_csv(
urlpath=f"{baseurl}/{suffix}",
csv_kwargs={
"sep": "=",
"header": None,
"names": ["index", "value"],
"skiprows": 2,
"converters": {"index": (lambda x: x.strip()), "value": parse_type},
},
).read()
metadata = df.set_index("index")["value"]
return metadata
metadata = load_google_landsat_metadata(path, row, product_id)
metadata.head()
index ORIGIN Image courtesy of the U.S. Geological Survey REQUEST_ID 0501702206266_00020 LANDSAT_SCENE_ID LC80140322016209LGN01 LANDSAT_PRODUCT_ID LC08_L1TP_014032_20160727_20170222_01_T1 COLLECTION_NUMBER 01 Name: value, dtype: object
slice() function discussed last example!
sel() function)y coordinates are in descending order, so you'll slice will need to be in descending order as well
This should take about a minute or so, depending on the speed of your laptop.
# City limits plot
limits_plot = city_limits.hvplot.polygons(
line_color="white", alpha=0, line_alpha=1, crs=32618, geo=True
)
# Landsat plot
landsat_plot = subset.sel(band=3).hvplot.image(
x="x",
y="y",
rasterize=True,
cmap="viridis",
frame_width=500,
frame_height=500,
geo=True,
crs=32618,
)
# Combined
landsat_plot * limits_plot
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Input In [60], in <cell line: 2>() 1 # City limits plot ----> 2 limits_plot = city_limits.hvplot.polygons( 3 line_color="white", alpha=0, line_alpha=1, crs=32618, geo=True 4 ) 6 # Landsat plot 7 landsat_plot = subset.sel(band=3).hvplot.image( 8 x="x", 9 y="y", (...) 15 crs=32618, 16 ) NameError: name 'city_limits' is not defined
Remember, NDVI = (NIR - Red) / (NIR + Red)
band_info.head()
NIR is band 5 and Red is band 4
# Selet the bands we need
NIR = subset.sel(band=5)
RED = subset.sel(band=4)
# Calculate the NDVI
NDVI = (NIR - RED) / (NIR + RED)
NDVI = NDVI.where(NDVI < np.inf)
NDVI.head()
# NEW: Use datashader to plot the NDVI
p = NDVI.hvplot.image(
x="x",
y="y",
geo=True,
crs=32618,
datashade=True, # NEW: USE DATASHADER
project=True, # NEW: PROJECT THE DATA BEFORE PLOTTING
frame_height=500,
frame_width=500,
cmap="viridis",
)
# NEW: Use a transparent rasterized version of the plot for hover text
# No hover text available with datashaded images so we can use the rasterized version
p_hover = NDVI.hvplot(
x="x",
y="y",
crs=32618,
geo=True,
rasterize=True,
frame_height=500,
frame_width=500,
alpha=0, # SET ALPHA=0 TO MAKE THIS TRANSPARENT
colorbar=False,
)
# City limits
limits_plot = city_limits.hvplot.polygons(
geo=True, crs=32618, line_color="white", line_width=2, alpha=0, line_alpha=1
)
p * p_hover * limits_plot
datashade=True keyword